Secondary structure formation of nucleic acids strongly depends on salt concentration and temperature.
We develop a theory for RNA folding that correctly accounts for sequence effects, the
entropic contributions associated with loop formation, and salt effects. Using an iterative expression
for the partition function that neglects pseudoknots, we calculate folding free energies and
minimum free energy configurations based on the experimentally derived base pairing free energies.
The configurational entropy of loop formation is modeled by the asymptotic expression $-c \ln m$,
where $m$ is the length of the loop and $c$ the loop exponent, which is an adjustable constant. Salt
effects enter in two ways: first, we derive salt induced modifications of the free energy parameters
for describing base pairing and, second, we include the electrostatic free energy for loop formation.
Both effects are modeled on the Debye-Hückel level including counterion condensation. We validate
our theory for two different RNA sequences: For tRNA-phe, the resultant heat capacity curves for
thermal denaturation at various salt concentrations accurately reproduce experimental results. For
the P5ab RNA hairpin, we derive the global phase diagram in the three-dimensional space spanned
by temperature, stretching force, and salt concentration and obtain good agreement with the
experimentally determined critical unfolding force. We show that for a proper description of RNA
melting and stretching, both salt and loop entropy effects are needed.



I. INTRODUCTION 



Ribonucleic acid (RNA) is one of the key players in molecular biology and has in the past attracted theoretical and 
experimental physicists because of its intriguing structural and functional properties. RNA has multiple functions: 
beyond being an information carrier it has regulatory and catalytic abilities [1]. Comprehending how RNA folds and
what influences the folding process are key questions [2]. Thus, the reliable prediction of RNA structure and stability 
under various conditions is crucial for our understanding of the functioning of RNA and nucleic acid constructs in 
general [311]. 

The influence of temperature and solution conditions on RNA folding remains of continuing interest to experimental groups.
Traditionally the thermal melting of RNA was monitored via differential scanning calorimetry or UV spectroscopy for
the bulk ensemble [SHE]. More recently, single molecule pulling and unzipping experiments have been used to unveil 
the influence of different solution conditions and even determine energy parameters [^llj . 

On the theoretical side, RNA denaturation has been modeled on various levels of coarse graining. Focusing on the 
secondary structure, namely the base pairs (bp), and omitting tertiary interactions, equilibrium folding and unfolding 
has been modeled very successfully [T2H12]. In the presence of a logarithmic contribution to the loop entropy, it
has been shown that homopolymeric RNA, where sequence effects are neglected, features a genuine phase transition,
which can be induced by force or temperature [18, 21-23]. However, the specific sequence influences the stretching
response of a molecule, which has been shown by Gerland et al. [THl [M], yet without considering the logarithmic loop
entropy. More detailed insights can be obtained from simulations, which are, however, numerically costly
compared to models focusing only on secondary structure. Coarse grained, Go-like simulations of short RNA hairpins
made it possible to analyze the dynamics of the folding and unfolding process [25, 26]. Ion specific effects have been studied
by performing molecular dynamics [27] or coarse grained simulations [28-31]. Much less is known about the salt
dependence of denaturation transitions of RNA.

While for DNA numerous corrections of the base pairing free energies due to varying salt concentration exist, see [32]
and references therein, analogous results for the salt dependence of RNA energy parameters are sparse [28]. However,
molecular biology and biotechnological applications depend on the reliable prediction of RNA stability for different 
solution conditions. 

Figure 1: Schematic representation of the secondary structure of an RNA molecule. Dots represent one base, i.e. cytosine,
guanine, adenine, or uracil. Solid lines denote the sugar-phosphate backbone bonds, broken lines base pairs, and thick gray
lines the non-nested backbone bonds, which are counted by the variable $M$, here $M = 11$. The thick arrows to either side
illustrate the force $F$ applied to the 5'- and 3'-end.

In this paper we extend these previous works and develop a theory that includes all these effects (sequence,
salt dependence, logarithmic loop entropy, and stretching force) and demonstrate that all are necessary to obtain a
complete picture of the thermodynamics of the secondary structure of RNA. Neglecting tertiary interactions, we use
a recursion relation that correctly accounts for logarithmic and thus non-linear free energy contributions
due to the configurational entropy of loops [H]. To include the influence of monovalent salt on RNA stability, we
model the RNA backbone as a charged polymer interacting via a Debye-Hückel potential and give heuristic formulas
for the modification of the loop free energy and the base pairing and stacking free energy parameters. Debye-Hückel
is a linear theory, yet we include non-linear effects caused by counterion condensation using Manning's concept [33].
The backbone elasticity of single stranded RNA (ssRNA) is described by the freely jointed chain (FJC) model. Our
approach provides a complete description of the behavior of RNA in the three-dimensional phase space spanned
by temperature, salt concentration, and external stretching force. We find that for an improved description of RNA
melting curves one needs to include both salt effects and loop entropy. Only the combined use of these two
contributions makes it possible to predict the shift of the melting temperature (due to salt) and the cooperativity (due to
logarithmic loop entropy), which is illustrated in the case of tRNA-phe. As an independent check we consider the
force induced unfolding of the P5ab RNA hairpin and observe good agreement with experimental values with no
fitting parameters. The influence of salt is illustrated by melting curves and force extension curves for various salt
concentrations. For the P5ab hairpin the phase diagram is determined and slices through the three-dimensional
parameter space are shown.

II. FREE ENERGY PARAMETERIZATION 

RNA folding can be separated into three steps, which occur sequentially and, to a fairly good approximation,
do not influence each other [34]. The primary structure of RNA is the mere sequence of its four bases cytosine (C),
guanine (G), adenine (A), and uracil (U). Due to base pairing, i.e. either the specific interaction of C with G or the
interaction of A with U, the secondary structure is formed. Therefore, on an abstract level, the secondary structure
is given by the list of all base pairs present in the molecule. Only after the secondary structure has formed do tertiary
contacts arise. Pseudoknots [35, 36], helix stacking, and base triples [37] as well as the overall three-dimensional
arrangement of the molecule are considered as parts of the tertiary structure. The main assumption of hierarchical
folding is that tertiary structure formation operates only on already existing secondary structure elements [34].
Although cases are known where this approximation breaks down, it generally constitutes a valid starting point [38].
In this paper, where the main point is the influence of the loop entropy and the salt concentration on the secondary 
structure, we therefore neglect tertiary interactions altogether. 

Given a set of base pairs, the secondary structure consists of helices and loops as the basic structural units, cf. fig. 1.
Since pseudoknots are neglected, every nucleotide can be attributed unambiguously to exactly one subunit. The free 
energy of a certain secondary structure is then given by the sum of the free energy contributions of the individual 
structural subunits, as we will detail now. 
A. Free energy of a loop

We model the free energy of a loop consisting of $m$ backbone bonds, see fig. 1, with

$$\mathcal{G}_l(m) = \mathcal{G}_l^{\mathrm{conf}}(m) + \mathcal{G}_l^{\mathrm{salt}}(m) + \mathcal{G}_l^{\mathrm{init}}. \tag{1}$$

The first term is the loop entropy difference between an unconstrained polymer and a ring-like polymer, which is
characterized by the loop exponent $c$ [2]j [23] |3SJ HP]

$$\mathcal{G}_l^{\mathrm{conf}}(m) = -k_B T \ln m^{-c} \tag{2}$$

with $k_B$ the Boltzmann constant and $T$ the absolute temperature. The loop exponent is $c_{\mathrm{ideal}} = 3/2$ for an ideal
polymer and $c_{\mathrm{SAW}} = 1.76$ for an isolated self avoiding loop. Helices emerging from the loop limit the configurational
space available to the loop and hence increase $c$. One obtains $c_1 = 2.06$ for terminal loops, $c_2 = 2.14$ for internal loops, and
$c_4 = 2.16$ for a loop with four emerging helices [21]. Since the differences between these exponent values are quite
small, we assume a constant loop exponent $c = 2.1$ in this paper and only compare with the case of vanishing loop
entropy characterized by $c = 0$.
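As a minimal numerical sketch of the conformational term, eq. (2), the following Python snippet evaluates $c\,k_B T \ln m$ for a few loop sizes. The function name and the choice of kcal/mol units are assumptions made for illustration; they are not taken from the paper.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K) (assumed unit choice)


def loop_conf_free_energy(m, c=2.1, temperature=300.0):
    """Conformational loop free energy of eq. (2): -k_B T ln(m^-c) = c k_B T ln(m)."""
    if m < 1:
        raise ValueError("a loop needs at least one backbone bond")
    return c * K_B * temperature * math.log(m)


if __name__ == "__main__":
    # Compare the exponent used in the text (c = 2.1) with an ideal loop (c = 1.5).
    for m in (4, 8, 16):
        print(m, loop_conf_free_energy(m), loop_conf_free_energy(m, c=1.5))
```

Because the penalty grows only logarithmically, doubling the loop size always adds the same increment $c\,k_B T \ln 2$, independent of $m$.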

The second term in eq. (1) describes the free energy difference between a charged ring of length $m l_{ss}$ and a straight
rod of the same length due to electrostatic interactions, with $l_{ss} = 6.4$ Å the length of one ssRNA backbone bond [30].
The electrostatics are modeled on the Debye-Hückel level [H]
with $l_B = e_0^2/(4\pi\epsilon_0\epsilon_r k_B T)$ the Bjerrum length, which in water has a value of roughly 7 Å,
$\kappa^{-1} = \sqrt{\epsilon_0\epsilon_r k_B T/(2 N_A e_0^2 I)}$
the Debye screening length, $\epsilon_0$ the vacuum dielectric constant, $\epsilon_r \approx 80$ the relative dielectric constant of water [42],
$I = 1/2(\rho_a z_a^2 + \rho_c z_c^2)$ the ionic strength, $\rho_a$/$\rho_c$ and $z_a$/$z_c$ the concentration and the valency of the anions/cations, $N_A$
the Avogadro constant, $e_0$ the elementary charge, $\gamma \approx 0.58$ Euler's constant, $\Gamma(a, x)$ the incomplete gamma function,
and ${}_pF_q$ the generalized hypergeometric functions [45]. To account for modifications of the line charge density $\tau_{ss}$ due
to non-linear electrostatic effects, we employ Manning's counterion condensation theory [33], predicting

$$\tau_{ss} = \min(1/l_{ss}, 1/(l_B z_c)). \tag{4}$$
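The length scales entering the electrostatic terms can be checked numerically. The sketch below evaluates the Bjerrum length, the Debye screening length, and the Manning line charge density of eq. (4) in SI units. It assumes a temperature-independent $\epsilon_r = 80$; the function names are illustrative only.

```python
import math

E0 = 1.602176634e-19     # elementary charge in C
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m
K_B = 1.380649e-23       # Boltzmann constant in J/K
N_A = 6.02214076e23      # Avogadro constant in 1/mol


def bjerrum_length(temperature=300.0, eps_r=80.0):
    """Bjerrum length l_B = e0^2 / (4 pi eps0 eps_r k_B T) in metres."""
    return E0**2 / (4.0 * math.pi * EPS0 * eps_r * K_B * temperature)


def debye_length(ionic_strength_molar, temperature=300.0, eps_r=80.0):
    """Debye screening length in metres; the ionic strength is given in mol/l."""
    ionic_si = ionic_strength_molar * 1000.0  # mol/l -> mol/m^3
    return math.sqrt(EPS0 * eps_r * K_B * temperature / (2.0 * N_A * E0**2 * ionic_si))


def manning_line_charge(bond_length, l_b, z_c=1):
    """Line charge density of eq. (4) in elementary charges per metre."""
    return min(1.0 / bond_length, 1.0 / (l_b * z_c))


if __name__ == "__main__":
    l_b = bjerrum_length()
    print("Bjerrum length / Angstrom:", l_b * 1e10)
    print("Debye length at 100 mM / nm:", debye_length(0.1) * 1e9)
    print("tau_ss * l_B:", manning_line_charge(6.4e-10, l_b) * l_b)
```

With these inputs the bare ssRNA charge spacing of 6.4 Å is slightly smaller than the Bjerrum length, so the minimum in eq. (4) selects the condensed value $1/l_B$ for monovalent counterions.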

Eq. (3) amounts to a ground state approximation of the electrostatic contribution to the free energy of a loop. This is
rationalized by the fact that the electrostatic interaction is screened and decays exponentially over the Debye length,
which is roughly $\kappa^{-1} \approx 1$ nm for 100 mM salt solution. However, typical distances between bases in a loop are of
the order of the helix diameter $d = 2$ nm or larger. Therefore, we expect electrostatic interactions to be basically
independent of the global configuration of a loop, which justifies both the ground state approximation and our
additivity approximation, where ion effects and conformational contributions decouple, see eq. (1). In the supporting
material, see eq. S4, we give an interpolation formula for eq. (3) involving no hypergeometric functions.

The last term in eq. (1) is the loop initiation free energy $\mathcal{G}_l^{\mathrm{init}}$. As we are employing a logarithmic loop entropy,
eq. (2), we cannot use the standard value for $\mathcal{G}_l^{\mathrm{init}}$, which was extracted from experimental data for a different loop
parameterization fSl|Sl. Therefore, a modified value $\mathcal{G}_l^{\mathrm{init}}$ is obtained by fitting $\mathcal{G}_l(m)$, given by eq. (1), to experimental
data using $c = 2.1$ in $\mathcal{G}_l^{\mathrm{conf}}(m)$ and the salt concentration $\rho = 1$ M in $\mathcal{G}_l^{\mathrm{salt}}(m)$, see fig. 2a. In this figure we show
experimentally determined free energies for terminal, internal, and bulge loops as a function of the loop size, which
exhibit a dependence on the type of the loop. As an approximation, we do not distinguish between those loop types
in the theory and consequently fit a single parameter $\mathcal{G}_l^{\mathrm{init}}$ to the data, which turns out to be $\mathcal{G}_l^{\mathrm{init}} = 1.9$ kcal/mol for
$T = 300$ K, see supporting material section C. In fig. 2a the fitted $\mathcal{G}_l(m)$ for the loop exponent $c = 2.1$ is depicted
by the solid line; the other lines illustrate the effect of different loop exponents on the loop free energy according to
eq. (1) using the same value for $\mathcal{G}_l^{\mathrm{init}}$. Fig. 2b illustrates the effect of salt on the loop free energy for a given value of
$c = 2.1$.
Figure 2: (a) Free energy of a loop as a function of the number of segments $m$ for different loop exponents $c = 0, 1.5, 2.1$
(lines) and for NaCl concentration $\rho = 1$ M. Symbols denote experimental values for various types of loops (hairpin, bulge,
internal) [30] for $\rho = 1$ M NaCl. $\mathcal{G}_l^{\mathrm{init}}$ is obtained by fitting $\mathcal{G}_l(m)$, eq. (1), to the experimental data for $c = 2.1$ and $\rho = 1$ M.
The same salt concentration and the same value for $\mathcal{G}_l^{\mathrm{init}}$ are used for plotting the curves with $c = 0, 1.5$. (b) Salt dependence
of the free energy of loops as a function of the number of segments $m$ for different salt concentrations $\rho = 1$ M, 0.1 M, 0.01 M
and loop exponent $c = 2.1$ according to eq. (1).



B. Free energy of a helix 

The free energy of a helix

$$\mathcal{G}_h = \mathcal{G}_h^{\mathrm{stack}} + \mathcal{G}_h^{\mathrm{init}} + \mathcal{G}_h^{\mathrm{term}} + \mathcal{G}_h^{\mathrm{salt}} \tag{5}$$

depends on the sequence $\{b_i\}$, which consists of the four nucleotides $b_i$ = C, G, A, U. The stacking free energy $\mathcal{G}_h^{\mathrm{stack}}$
is based on experimentally determined parameters incorporating the base pairing free energy as well as the stacking
free energy between neighboring base pairs. In the standard notation, $g_h^{\mathrm{stack}}[(b_i,b_j),(b_{i+1},b_{j-1})]$ is the contribution
of the two neighboring, stacked base pairs $(b_i,b_j)$ and $(b_{i+1},b_{j-1})$ to $\mathcal{G}_h^{\mathrm{stack}}$. The explicit values for the enthalpic and
entropic parts are given in the supporting material. We use the expanded nearest neighbor model E] to calculate
the base pairing and stacking contributions of a helical section ranging from base pair $(i, j)$ through $(i + h, j - h)$ and
obtain

$$\mathcal{G}_h^{\mathrm{stack}} = \sum_{h'=1}^{h} g_h^{\mathrm{stack}}[(b_{i+h'-1}, b_{j-h'+1}), (b_{i+h'}, b_{j-h'})]. \tag{6}$$

The initiation and termination free energies in eq. (5) take into account weaker pairing energies of AU or GU base
pairs at the ends of the helix. We use the standard literature values for $\mathcal{G}_h^{\mathrm{init}}$ and $\mathcal{G}_h^{\mathrm{term}}$ [5, 6] and summarize the
explicit values in the supporting material. Increasing the salt concentration increases the stability of a helix: First,
counterions condense on the negatively charged backbone and reduce the electrostatic repulsion and, second, the
diffuse counterion cloud surrounding the charged molecule screens the interaction. We model the two strands of a
helix as two parallel rods at distance $d = 2$ nm interacting via a Debye-Hückel potential characterized by the screening
length $\kappa^{-1}$. The electrostatic interaction energy per nucleotide with the other strand is given by

$$g_h^{\mathrm{salt}}(\rho) = k_B T \tau_{ds}^2 l_{ds} l_B \int_{-\infty}^{\infty} \frac{e^{-\kappa\sqrt{d^2+z^2}}}{\sqrt{d^2+z^2}}\, dz = 2 k_B T \tau_{ds}^2 l_{ds} l_B K_0(\kappa d). \tag{7}$$

$l_{ds} = 3.4$ Å is the helical rise per base pair of double-stranded RNA (dsRNA) and $K_0(\kappa d)$ is the zeroth order modified
Bessel function of the second kind. Again, we employ Manning's theory [33] to calculate the line charge density
$\tau_{ds} = \min(1/l_{ds}, 1/(l_B z_c))$. The reference state for the salt correction of the pairing free energy is at temperature $T = 300$ K
with monovalent salt concentration $\rho = 1$ M, as the experimental pairing free energies $g_h^{\mathrm{stack}}$ were determined at this
concentration. The free energy shift for a helix consisting of $h$ base pairs due to electrostatic interactions is then

$$\mathcal{G}_h^{\mathrm{salt}} = h\,(g_h^{\mathrm{salt}}(\rho) - g_h^{\mathrm{salt}}(1\,\mathrm{M})). \tag{8}$$
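The salt shift of eq. (8) can be sketched numerically as follows. To stay within the standard library, $K_0$ is evaluated from its integral representation $K_0(x) = \int_0^\infty e^{-x\cosh t}\,dt$ rather than from a special-function library. The function names, the fixed $\epsilon_r = 80$, and the output in units of $k_B T$ are assumptions of this illustration.

```python
import math

E0, EPS0 = 1.602176634e-19, 8.8541878128e-12  # C, F/m
K_B, N_A = 1.380649e-23, 6.02214076e23        # J/K, 1/mol
L_DS = 0.34e-9    # helical rise per base pair in m
D_HELIX = 2.0e-9  # distance between the two strands in m


def bessel_k0(x, t_max=12.0, steps=4000):
    """K_0(x) for x > 0 from int_0^inf exp(-x cosh t) dt (trapezoidal rule)."""
    dt = t_max / steps
    total = 0.5 * math.exp(-x)  # end point t = 0; the end point t_max is negligible
    for n in range(1, steps + 1):
        total += math.exp(-x * math.cosh(n * dt))
    return total * dt


def g_salt_per_bp(rho_molar, temperature=300.0, eps_r=80.0):
    """Interaction energy per nucleotide, eq. (7), in units of k_B T (monovalent salt)."""
    kt = K_B * temperature
    l_b = E0**2 / (4.0 * math.pi * EPS0 * eps_r * kt)
    kappa = math.sqrt(2.0 * N_A * E0**2 * rho_molar * 1000.0 / (EPS0 * eps_r * kt))
    tau_ds = min(1.0 / L_DS, 1.0 / l_b)  # Manning condensation, z_c = 1
    return 2.0 * tau_ds**2 * L_DS * l_b * bessel_k0(kappa * D_HELIX)


def helix_salt_shift(h, rho_molar, temperature=300.0):
    """Free energy shift of eq. (8) in units of k_B T, relative to 1 M salt."""
    return h * (g_salt_per_bp(rho_molar, temperature) - g_salt_per_bp(1.0, temperature))


if __name__ == "__main__":
    for rho in (0.02, 0.15, 1.0):
        print(rho, helix_salt_shift(10, rho))
```

The shift vanishes at the 1 M reference state by construction and becomes increasingly positive (destabilizing) as the salt concentration is lowered, since $K_0(\kappa d)$ grows when screening weakens.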

The use of Debye-Hückel theory to incorporate salt effects captures the overall dependence on temperature
and salt concentration but involves several approximations. First, we are using Manning's counterion condensation
theory to obtain the actual line charge density of ssRNA and dsRNA [33]. However, Manning condensation is
known to underestimate the line charge at increasing salt concentration and therefore favors the bound state [45].
Second, when calculating the electrostatic energy of a loop we effectively use a ground state approximation and
neglect conformational fluctuation effects. Third, when two ssRNA strands come together to form a helix, the line
charge density increases since the distance between two bases decreases. The salt dependence of the work to decrease
the axial distance between two bases from $l_{ss} = 6.4$ Å to $l_{ds} = 3.4$ Å is neglected. This approximation favors the
unbound state. Therefore, it is very important to validate the model we employ, which we do by detailed comparison
with experimental data. From the favorable comparison with experiments we tentatively conclude that the various
errors partially cancel and the resulting expression for the salt influence is quite accurate. We point out that after
determining $\mathcal{G}_l^{\mathrm{init}}$ in eq. (1), no further fitting is done and only standard literature values are used.

Our theory is able to consider variations of the salt concentration as well as of the temperature, which makes it 
suitable to study RNA melting at various salt concentrations in a consistent way. However, since our approach is 
solely based on mean field theory, it will become unreliable in the case of multivalent ions, where correlations become 
important. Also, ion specific effects, which are important for divalent ions such as Mg$^{2+}$ [46], are not considered in
our approach. 



C. Response of the molecule to an external stretching force 



In atomic force microscope or optical tweezers experiments, it is possible to apply a stretching force F to the two 
terminal bases of the molecule. We model the stretching response of the $M$ non-nested backbone bonds, see fig. 1,
with the freely jointed chain (FJC) model [TH [ITl [24| . A non-nested bond is defined as a backbone bond that is
neither part of a helix nor part of a loop. It is outside all secondary structure elements and therefore contributes to
the end-to-end extension observed in force spectroscopy experiments. The force dependent contribution to the free
energy per non-nested monomer is given by

$$g_f^{\mathrm{FJC}} = -k_B T \frac{l_{ss}}{b_{ss}} \ln\frac{\sinh(\beta F b_{ss})}{\beta F b_{ss}}, \tag{9}$$

where $\beta = 1/(k_B T)$ is the inverse thermal energy and $b_{ss} = 1.9$ nm is the Kuhn length of ssRNA [T^ (we used the
Kuhn length of ssDNA as the corresponding ssRNA data is less certain). The stretching response of one non-nested
monomer to an external force is then given by

$$x^{\mathrm{FJC}}(F) = -\frac{\partial g_f^{\mathrm{FJC}}}{\partial F} = l_{ss}\mathcal{L}(\beta F b_{ss}) = l_{ss}\left(\coth(\beta F b_{ss}) - 1/(\beta F b_{ss})\right), \tag{10}$$

where $\mathcal{L}$ is the Langevin function. Electrostatic effects on the stretching response are considered to be small and hence are
neglected [i71B5] .
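The freely jointed chain expressions can be illustrated with a short Python sketch that evaluates the free energy per non-nested bond and the corresponding extension, and that can be used to confirm numerically that the extension is the negative force derivative of the free energy. Units of pN and nm and all function names are assumptions of this sketch.

```python
import math

KT_PN_NM = 4.142  # k_B T at 300 K in pN nm
L_SS = 0.64       # ssRNA backbone bond length in nm
B_SS = 1.9        # Kuhn length in nm


def langevin(y):
    """Langevin function coth(y) - 1/y with its small-argument limit y/3."""
    if abs(y) < 1e-4:
        return y / 3.0
    return 1.0 / math.tanh(y) - 1.0 / y


def fjc_free_energy(force_pn, kt=KT_PN_NM):
    """Free energy per non-nested bond, -kT (l_ss/b_ss) ln(sinh(y)/y), in pN nm."""
    y = force_pn * B_SS / kt
    if y < 1e-4:
        return -kt * (L_SS / B_SS) * y * y / 6.0
    # ln(sinh(y)/y) written in an overflow-safe form for large y
    log_term = y + math.log1p(-math.exp(-2.0 * y)) - math.log(2.0 * y)
    return -kt * (L_SS / B_SS) * log_term


def fjc_extension(force_pn, kt=KT_PN_NM):
    """Extension per non-nested bond in nm, l_ss * L(beta F b_ss)."""
    return L_SS * langevin(force_pn * B_SS / kt)


if __name__ == "__main__":
    for f in (1.0, 5.0, 20.0):
        print(f, fjc_extension(f), fjc_free_energy(f))
```

At small forces the response is linear, $x \approx l_{ss}\beta F b_{ss}/3$, while at large forces the extension per bond saturates at $l_{ss}$.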



III. CALCULATION OF THE PARTITION FUNCTION 



So far we showed how to calculate the free energy of one given secondary structure. The next step is to enumerate
all possible secondary structures and to obtain the partition function, from which the thermodynamics of
the system follows. As we neglect tertiary contacts, and in particular pseudoknots, for any two base pairs $(i, j)$ and $(k, l)$
with $i < j$, $k < l$, and $i < k$ we have either $i < k < l < j$ or $i < j < k < l$. This makes it possible to derive a recursion
relation for the partition function of the secondary structure. In our notation, the canonical partition function $Q_{i,j}^M$ of
a sub-strand from base $i$ at the 5'-end through $j$ at the 3'-end depends on the number of non-nested backbone bonds
$M$ [ini mi [53] , see fig. 1. The recursion relations for $Q_{i,j}^M$ can be written as



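$$Q_{i,j+1}^{M+1} = \frac{v_f(M+1)}{v_f(M)}\left[Q_{i,j}^{M} + \sum_{k=i+M+1}^{j-N_{\mathrm{loop}}} Q_{i,k-1}^{M}\, Q_{k,j+1}^{0}\right] \tag{11a}$$

and

$$Q_{k,j+1}^{0} = \sum_{h=1}^{(j-k-N_{\mathrm{loop}})/2} \exp\left[-\beta\,\mathcal{G}_h^{(k,j+1),(k+h,j+1-h)}\right] \sum_{m=1}^{j-k-1-2h} Q_{k+1+h,j-h}^{m}\, \frac{\exp[-\beta\,\mathcal{G}_l(m+2)]}{v_f(m)}. \tag{11b}$$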
Figure 3: Melting curve of the 76 bases long tRNA-phe of yeast; the minimum free energy structure at $\rho = 1$ M, $T = 300$ K,
$c = 2.1$ is shown as an inset. Symbols denote experimental melting curves for NaCl concentrations $\rho = 20$ mM (squares)
and 150 mM (circles) [7]. Our predictions for different salt concentrations are depicted by the dashed (20 mM), dash-dotted
(150 mM), and solid (1 M) lines. The respective arrows indicate melting temperatures obtained by experiments of another
group [8] at the same salt concentration $\rho = 150$ mM (left arrow) and $\rho = 1$ M (right arrow). The dotted line shows our
prediction for $\rho = 150$ mM and $c = 0$ and exemplifies that a non-zero loop exponent is responsible for rendering the transition
more cooperative, in closer agreement with experiment; for 150 mM and $c = 0$ the melting temperature is higher
since the energy parameters are optimized for $c = 2.1$. The gray dash-dotted curve is the prediction of the Vienna package,
which uses a linearized multi-loop entropy corresponding to $c = 0$ and $\rho = 1$ M. This is to be compared to our prediction for
$c = 2.1$ and $\rho = 1$ M: while the melting temperatures are similar, the cooperativity, i.e. the widths of the peaks, are different
due to different loop exponents.



Eq. (11a) describes elongation of an RNA structure by either adding an unpaired base (first term) or by adding an
arbitrary sub-strand $Q_{k,j+1}^{0}$ that is terminated by a helix. Eq. (11b) constructs $Q_{k,j+1}^{0}$ by closing structures with $m$
non-nested bonds, summed up in $\sum_m$, by a helix of length $h$. $N_{\mathrm{loop}} = 3$ is the minimum number of bases in a
terminal loop. $v_f(M)$ denotes the number of configurations of a free chain with $M$ links and drops out by introducing
the rescaled partition function $\tilde{Q}_{i,j}^{M} = Q_{i,j}^{M}/v_f(M)$ and will not be considered further since its effects on the partition
function are negligible. $\mathcal{G}_h^{(k,j+1),(k+h,j+1-h)}$ is the free energy of a helix beginning with base pair $(k, j+1)$ and ending with
base pair $(k+h, j+1-h)$ according to eq. (5). $\mathcal{G}_l(m+2)$ is the free energy of a loop consisting of $m+2$ segments
as given by eq. (1). $\mathcal{G}_l$ and $\mathcal{G}_h$ contain all interactions discussed in the previous section. Eq. (11) allows one to compute
the partition function in polynomial time ($\mathcal{O}(N^4)$). Further, our formulation makes it possible to treat non-linear functions for
$\mathcal{G}_l(m)$ and $\mathcal{G}_h(h)$; for instance, $\mathcal{G}_l(m)$ is strongly non-linear by virtue of eqs. (2) and (3).
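The structure of the recursion can be illustrated with a deliberately simplified Python sketch. It is not the full model of eq. (11): helices are built from single base pairs with a user-supplied Boltzmann weight (the hypothetical callback `pair_weight`), stacking, initiation, termination, and salt terms are omitted, every loop with $m$ interior non-nested bonds receives the weight $(m+2)^{-c}$, and the rescaled partition function is used so that $v_f$ drops out.

```python
def partition_function(n, pair_weight, c=2.1, n_loop=3):
    """Toy analogue of eq. (11) for a chain of n bases (0-based, inclusive indices).

    q[i][j][M] is the rescaled partition function of bases i..j with M
    non-nested backbone bonds; n_loop >= 1 is the minimum hairpin loop size.
    """
    q = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        q[i][i][0] = 1.0  # a single base carries no bonds
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length - 1
            # Closed structure, analogue of eq. (11b): base i pairs with base j
            # and closes a loop around the interior strand i+1..j-1.
            if length - 2 >= n_loop:
                inner = sum(q[i + 1][j - 1][m] * (m + 2) ** (-c)
                            for m in range(length - 2))
                q[i][j][0] = pair_weight(i, j) * inner
            # Elongation, analogue of eq. (11a): base j is either unpaired
            # or pairs with some base k to its left.
            for big_m in range(0, length - 1):
                value = q[i][j - 1][big_m]
                for k in range(i + big_m + 1, j - n_loop):
                    value += q[i][k - 1][big_m] * q[k][j][0]
                q[i][j][big_m + 1] = value
    # Zero-force partition function: sum over all numbers of non-nested bonds.
    return sum(q[0][n - 1])


if __name__ == "__main__":
    # With unit weights and c = 0 the recursion simply counts secondary structures.
    print(partition_function(6, lambda i, j: 1.0, c=0.0, n_loop=1))
    print(partition_function(6, lambda i, j: 1.0, c=2.1, n_loop=1))
```

The four nested loops over chain length, start index, $M$, and $k$ reflect the $\mathcal{O}(N^4)$ scaling mentioned in the text. Because the loop weight enters only through the sum over $m$, any non-linear function of $m$ could be substituted for $(m+2)^{-c}$ without changing the structure of the recursion.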

The unrestricted partition function of the entire RNA, where the number of non-nested backbone bonds $M$ is
allowed to fluctuate, is given by

$$Z_N = \sum_{M=0}^{N} \exp[-\beta g_f^{\mathrm{FJC}} M]\, Q_{0,N}^{M} \tag{12}$$

and contains the influence of force via $g_f^{\mathrm{FJC}}$ defined in eq. (9). The partition function $Z_N$ contains all secondary
structure interactions, but neglects pseudoknots and other tertiary interactions. As has been argued before, this
approximation is known to work very well [34] and yields reliable predictions for the stability of nucleic acids [49].

Using the same ideas, we determine the minimum free energy (mfe) and the mfe structure. The mfe structure is
defined as the secondary structure that gives the largest contribution to the partition function. Since it cannot be
derived from the partition function itself, it has to be determined from a slightly modified set of recursion relations,
see supporting material.



IV. SALT DEPENDENCE OF MELTING CURVES 



In this section we calculate melting curves for different salt concentrations by applying eqs. (1) and (5), which
include our salt dependent free energy parameterization. In fig. 3 we compare experimental results [7, 8] with our
predictions for the heat capacity of yeast tRNA-phe; the sequence is given in the supporting material section D. The
heat capacity is readily obtained by

$$C = T \frac{\partial^2 (k_B T \ln Z_N)}{\partial T^2}, \tag{13}$$

where $Z_N$ is the unrestricted partition function of the RNA at zero force, eq. (12). In all our calculations, we use
the same literature parameter set for the stacking and pairing free energy $g_h^{\mathrm{stack}}$. No additional fit parameter enters
except the loop initiation free energy $\mathcal{G}_l^{\mathrm{init}}$, which is determined in fig. 2a from a separate experimental data set.
The salt dependence of the experimentally observed melting temperatures is reproduced well, compare fig. 3. The
arrows indicate additional experimental results [8] for the melting temperature for $\rho = 150$ mM and 1 M, which again
coincide with our prediction. We also plot a calculated melting curve for loop exponent $c = 0$ and NaCl concentration
$\rho = 150$ mM, which exhibits a far less cooperative transition than observed in the corresponding curve with $c = 2.1$.
Finally, we compare our prediction for $\rho = 1$ M and $c = 2.1$ with the prediction of RNAheat in the Vienna Package [50]
for $\rho = 1$ M, which uses a linearized multi-loop entropy amounting to $c = 0$ in our framework. The predicted melting
temperatures are almost identical. However, the widths of the peaks in both melting curves differ and our melting
profile for $c = 2.1$ is more peaked. Taking all these observations together leads to the conclusion that only a combined
use of logarithmic loop entropy (characterized by a non-zero loop exponent) and salt dependent free energy corrections
leads to a correct prediction of melting curves. The additional features in the experimental data, e.g. the shoulder
at lower temperatures and the increased width of the experimental curves, might be attributed to tertiary structure
rearrangements, which are not captured by our approach, or to melting occurring in multiple stages.
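The heat capacity of eq. (13) can be evaluated from any routine that returns $\ln Z_N$ as a function of temperature. The following sketch uses a central second difference and, purely as a stand-in for the full RNA partition function, a hypothetical two-state model with unfolding enthalpy `dh` and entropy `ds`, for which the result is known analytically.

```python
import math

R = 0.0019872  # gas constant in kcal/(mol K), i.e. k_B per mole


def heat_capacity(ln_z, temperature, dt=0.1):
    """C = T d^2(k_B T ln Z)/dT^2, eq. (13), via a central second difference."""
    def f(t):
        return R * t * ln_z(t)
    second = (f(temperature + dt) - 2.0 * f(temperature) + f(temperature - dt)) / dt**2
    return temperature * second


def two_state_ln_z(t, dh=50.0, ds=0.15):
    """ln Z of a toy two-state system (folded reference state plus unfolded state)."""
    return math.log1p(math.exp(-(dh - t * ds) / (R * t)))


if __name__ == "__main__":
    t_melt = 50.0 / 0.15  # temperature at which both states are equally populated
    for t in (t_melt - 20.0, t_melt, t_melt + 20.0):
        print(t, heat_capacity(two_state_ln_z, t))
```

For the two-state toy model the heat capacity at the midpoint equals $\Delta H^2/(4 R T_m^2)$, which offers a quick check of the finite-difference step before applying the same routine to a sequence-dependent partition function.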



V. SALT DEPENDENCE OF STRETCHING CURVES 



Apart from temperature, force is an important variable to study denaturation of RNA molecules [Ui n31[T7l[^[^[5T| -
[57]. In fig. 4 we show the salt dependence of stretching curves for yeast tRNA-phe. The stretching curves have been
obtained by describing the force response of the $M$ non-nested backbone bonds, see fig. 1, with the freely jointed
chain (FJC) model, see eq. (10),

$$x(F) = k_B T \frac{\partial \ln Z_N}{\partial F} = k_B T \frac{\partial \ln Z_N}{\partial g_f^{\mathrm{FJC}}} \frac{\partial g_f^{\mathrm{FJC}}}{\partial F} = \langle M \rangle\, x^{\mathrm{FJC}}(F), \tag{14}$$

where we used the expectation value of the number of non-nested backbone segments

$$\langle M \rangle = -k_B T \frac{\partial \ln Z_N}{\partial g_f^{\mathrm{FJC}}}. \tag{15}$$
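Eqs. (14) and (15) translate directly into a weighted average over the restricted partition functions. The sketch below assumes that the values $Q^M$ are already available as a list indexed by $M$ (here a hypothetical two-state hairpin with a folded state at $M = 0$ and a fully open state at $M = 30$); the function names and the pN/nm units are illustrative.

```python
import math


def mean_nonnested(q_by_m, g_fjc_over_kt):
    """<M> of eq. (15), with weights exp(-beta g M) Q^M as in eq. (12)."""
    weights = [q * math.exp(-g_fjc_over_kt * m) for m, q in enumerate(q_by_m)]
    z = sum(weights)
    return sum(m * w for m, w in enumerate(weights)) / z


def extension(q_by_m, force_pn, kt=4.142, l_ss=0.64, b_ss=1.9):
    """Total extension x(F) = <M> x_FJC(F), eq. (14), in nm."""
    y = force_pn * b_ss / kt
    if y < 1e-6:
        return 0.0
    # beta * g_FJC per non-nested bond: -(l_ss/b_ss) ln(sinh(y)/y), overflow-safe
    g_over_kt = -(l_ss / b_ss) * (y + math.log1p(-math.exp(-2.0 * y)) - math.log(2.0 * y))
    x_fjc = l_ss * (1.0 / math.tanh(y) - 1.0 / y)
    return mean_nonnested(q_by_m, g_over_kt) * x_fjc


if __name__ == "__main__":
    # Hypothetical hairpin: folded state favored by 20 k_B T at zero force.
    toy_q = [math.exp(20.0)] + [0.0] * 29 + [1.0]
    for f in (1.0, 10.0, 20.0):
        print(f, extension(toy_q, f))
```

In this toy example the force couples only to the number of non-nested bonds, so the chain remains folded at low force and opens once the stretching free energy gained by 30 free bonds exceeds the folding free energy.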



As for the melting curves, one observes that increasing salt concentration stabilizes the structure, leading to higher
unfolding forces. All curves converge in the large force limit to a freely jointed chain of the length of the whole RNA
molecule $(N - 1) l_{ss}$, where $N = 76$ is the number of bases in the chain. The deviation for small forces from this
theoretical prediction is due to the secondary structure of RNA, which is present at small forces and which becomes
disrupted at forces $F >$ 3-7 pN. In fig. 5a we show the force extension curve of the P5ab hairpin [5]; the sequence is
given in the supporting material section D. Apart from the salt dependence of the force extension curve, one observes
that the unzipping of the helix occurs in two stages. This is seen best by considering the fraction of non-nested
segments and its derivative, fig. 5b. The first stage is a smooth unzipping of the first three base pairs up to the bulge
loop, visible as a shoulder at $F \approx 8$ pN in the derivative. The second stage is a sharp transition, where the rest of
the hairpin unzips. In fig. 5c we show mfe predictions for the secondary structure at different forces for $\rho = 1$ M
NaCl. For $F < 5$ pN, we correctly predict the experimentally observed native state with all base pairs intact [3]. For
forces $F \approx 8$ pN, an intermediate state appears, where the first three base pairs are unzipped up to the bulge loop.
Denaturation is observed for $F > 14$ pN. The native structure of the P5ab hairpin contains the stacked pairs
bp(17,42) and bp(18,41) [9]. For this stack, no free energy parameters are available and we use the parameters of a
different stack instead. However, other parameterizations for this stack work equally well and reproduce the experimental
transition force within errors, see fig. 6.



VI. PHASE DIAGRAMS OF RNA HAIRPIN P5AB 



With the tools established in the previous sections, we are now able to study phase diagrams of RNA. We consider 
the P5ab hairpin, which is a well studied system [SI [211 [IS [SSI [Hj. In fig. 7b the phase diagram in the $F$-$\rho$ plane is
Figure 4: Salt dependence of stretching curves of tRNA-phe for different salt concentrations $\rho = 20$ mM, 150 mM, 1 M.
Increasing salt concentration stabilizes the secondary structure due to screening of the electrostatic interaction. The dotted
line is the theoretical prediction for the force extension curve of a freely jointed chain, eq. (10). The deviation of the predicted
curves for RNA from the FJC curve is due to the presence of secondary structure. The observed plateau force is due to the
rupture of the secondary structure. We show the force extension curves in (a) a linear plot and (b) a double logarithmic plot,
indicating that the force extension curve is linear in the low-force regime, before the secondary structure is ruptured.

Figure 5: (a) Salt dependence of stretching curves of the 56 bases long RNA hairpin P5ab ^Qj for different salt concentrations
$\rho = 20$ mM, 150 mM, 1 M. Increasing salt concentration stabilizes the secondary structure due to screening of the electrostatic
interaction. The dotted line is the force extension curve of an FJC, eq. (10). (b) The fraction of non-nested segments $M/N$
as a function of force. One observes that the unzipping of the hairpin occurs in two stages, which is visible as a shoulder for
5 pN $< F <$ 9-10 pN (exact values depend on the salt concentration) and a successive cooperative transition. The inset
shows the derivative $\Delta M/(N \Delta F)$ for $\rho = 1$ M, where the first transition is visible as a shoulder at $F \approx 8$ pN. The sharp peak at
$F = 11$ pN is the rupture of the complete helix. (c) Predicted minimum free energy structures of the hairpin P5ab at different
forces, see also the supporting material section A. For $F < 5$ pN the hairpin is in the native state with all base pairs intact.
At $F \approx 8$ pN the first helix, consisting of three base pairs and bounded by the bulge loop, is ruptured. This causes the first
smooth transition. Forces $F > 14$ pN lead to the unzipping of the whole hairpin in a very cooperative fashion.



shown for $T = 298$ K, 300 K, 320 K and $c = 2.1$. The phase boundary is defined as the force where half of the helical
section is unzipped. For the definition of the phase boundary, we exclude the three unpaired bases at the 5'- and
the four bases at the 3'-end, see fig. 5c, and use the condition $M - 7 = (N - 7)/2$. This threshold value of $M/N$ is
depicted by an arrow in fig. 7a. Below the phase boundary, the hairpin is stable; above it, the molecule is denatured.
In fig. 7b we additionally include the experimental results by Liphardt et al., which agree nicely with our results. It
is important to note that this transition is not a phase transition in the strict statistical mechanics sense, but just
a crossover. A true phase transition is defined as a non-analyticity of the free energy, which can only occur for an
infinite system with long-range interactions [21]. The three-dimensional phase space we are considering is spanned by
temperature, force, and salt concentration. In figs. 8 and 9 we show slices in the $F$-$T$ and in the $T$-$\rho$ plane. The phase
boundary for the $F$-$T$ plane is determined the same way as in the $F$-$\rho$ plane, yet with varying temperature and fixed
salt concentration. The phase boundary in the $T$-$\rho$ plane is determined differently: heat capacity curves as a function
of temperature are calculated for different salt concentrations. The position of the peaks in the heat capacity curves
(one is depicted by an arrow in fig. 9a) determines the phase diagram in fig. 9b. Therefore, slight differences between
Figure 6: The effect of different parameterizations of the free energy parameters for the stack of bp(17,42) and bp(18,41) on the denaturation curve is marginal.
Here, the fraction of non-nested backbone bonds is plotted against the force for the P5ab hairpin and $T = 298$ K, $\rho = 250$ mM,
$c = 2.1$. The solid line is obtained with the parameters used for the P5ab hairpin in the rest of this paper gh''''^'' ( aa ) ~
^stack^GG^ The dotted line is obtained by using ffh''''''(AA) = ffh'"'''(AA)' whereas the dashed line is for gl'-'"'^{°^) = 0. All
three curves coincide and differ only slightly at the transition, exhibiting only marginally different transition forces, which all
agree with the experimentally observed unfolding force within errors (Liphardt et al.). The values of the free energy parameters
are given in the supporting material.




Figure 7: (a) Fraction of non-nested segments of the P5ab hairpin as a function of force for different salt concentrations and
constant temperature T = 298 K. The position of the crossover, which is defined as the point where M − 7 = (N − 7)/2,
i.e. M/N = 0.56 (indicated by the arrow), determines the phase diagram. (b) Phase diagram of the P5ab hairpin in the F-p
plane for different temperatures T = 298 K, 300 K, 320 K. Below the curve the RNA is in the hairpin phase, above the RNA
is denatured. The symbol at p = 250 mM, F = 13.3 pN, and T = 298 K denotes the experimental data by Liphardt et al. [5]
and coincides with our prediction.



the phase diagrams in figs. 7 and 8 on the one hand and fig. 9 on the other hand may arise.
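
The crossover criterion M − 7 = (N − 7)/2 can be made concrete with a short numerical sketch. The snippet is only an illustration: it assumes N = 56 (the number of bases in the P5ab sequence listed in section D), uses made-up sample points for the denaturation curve, and locates the force at which M/N crosses the threshold by linear interpolation. The function names and the sample data are hypothetical and are not taken from the calculations in this paper.

```python
def crossover_threshold(n_total=56, n_excluded=7):
    """Value of M/N at which half of the helical section is unzipped.

    Solves M - n_excluded = (n_total - n_excluded) / 2 for M and divides by N.
    n_total = 56 is an assumption (number of bases of the P5ab sequence).
    """
    m_half = n_excluded + (n_total - n_excluded) / 2
    return m_half / n_total


def crossover_force(forces, fractions, threshold):
    """Linearly interpolate the force at which the fraction M/N crosses the threshold."""
    points = list(zip(forces, fractions))
    for (f0, x0), (f1, x1) in zip(points, points[1:]):
        if x0 < threshold <= x1:
            return f0 + (threshold - x0) * (f1 - f0) / (x1 - x0)
    return None  # no crossing inside the sampled force range


# Hypothetical sample of a denaturation curve (force in pN, fraction M/N).
sample_forces = [10.0, 12.0, 14.0, 16.0]
sample_fractions = [0.20, 0.30, 0.80, 1.00]

threshold = crossover_threshold()
print(round(threshold, 4))  # 0.5625
print(crossover_force(sample_forces, sample_fractions, threshold))
```

With N = 56 the threshold evaluates to 0.5625, which matches the rounded value M/N = 0.56 quoted in the caption of Figure 7.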

We observe that for large salt concentrations, the denaturation forces and temperatures are rather independent of
the salt concentration, see figs. 7 and 9. Only when the Debye screening length is of the order of the typical
length scale of RNA, which is the case for p < 100 mM, is a marked dependence on the salt concentration observed.
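
The statement that salt matters once the Debye screening length reaches the length scale of RNA can be checked with a quick estimate. The sketch below uses the textbook expressions for the Bjerrum length and for the Debye length of a 1:1 electrolyte. The temperature-independent dielectric constant of 78.4 and the function names are illustrative assumptions, not parameters quoted in this paper.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C
EPS0 = 8.8541878128e-12      # vacuum permittivity in F/m
K_B = 1.380649e-23           # Boltzmann constant in J/K
N_A = 6.02214076e23          # Avogadro constant in 1/mol


def bjerrum_length_nm(temperature_k=298.15, eps_r=78.4):
    """Bjerrum length l_B = e^2 / (4 pi eps0 eps_r k_B T) in nm (eps_r is an assumption)."""
    l_b_m = E_CHARGE**2 / (4 * math.pi * EPS0 * eps_r * K_B * temperature_k)
    return l_b_m * 1e9


def debye_length_nm(salt_molar, temperature_k=298.15, eps_r=78.4):
    """Debye length 1/kappa in nm for a monovalent salt, kappa^2 = 8 pi l_B n."""
    l_b_m = bjerrum_length_nm(temperature_k, eps_r) * 1e-9
    n_salt = salt_molar * 1e3 * N_A          # salt number density in 1/m^3
    kappa = math.sqrt(8 * math.pi * l_b_m * n_salt)
    return 1e9 / kappa


for conc in (0.02, 0.1, 0.25, 1.0):
    print(f"{conc:5.2f} M -> Debye length {debye_length_nm(conc):.2f} nm")
```

Under these assumptions, 100 mM corresponds to a screening length of roughly 1 nm and 1 M to roughly 0.3 nm.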



VII. CONCLUSIONS 



We construct a theory for RNA folding and melting that includes the effects of monovalent salt, loop entropy, and 
stretching forces. Our theory is based on salt and temperature dependent modifications of the free energies of RNA 
helices and loops that include electrostatic interactions on the linear Debye-Hückel level - augmented by Manning
condensation - and conformational fluctuation effects via the asymptotic, non-linear expression for the entropy of 
loop formation. Decreasing salt concentration is shown to generally destabilize RNA folds and to lower denaturation 
Figure 8: (a) Fraction of non-nested segments of the P5ab hairpin as a function of force for different temperatures and constant
salt concentration p = 250 mM. The position of the crossover (arrow, M/N = 0.56) determines the phase boundary. With
increasing temperature a decrease of the denaturation force is observed. Above the melting temperature T_m ≈ 358 K the
molecule is always in the denatured state. (b) Phase diagram of the P5ab hairpin in the F-T plane. Below the curve the RNA
is in the native hairpin phase, above the RNA is denatured. The symbol denotes experimental values [9].




Figure 9: (a) Heat capacity curves for different salt concentrations as a function of temperature. The peak position moves to
higher temperatures with increasing salt concentration. The positions of the peaks, denoted exemplarily for the curve with
p = 20 mM by the arrow, determine the phase diagram. (b) Phase diagram of the P5ab hairpin in the T-p plane. Below the
curve the RNA is in the native hairpin phase, above the RNA is denatured.



temperatures and forces. The predictions are in good agreement with experimental data as shown for two different 
scenarios, namely the heat capacity curves for the thermal denaturation of tRNA-phe and the response of the P5ab 
RNA hairpin to an external pulling force. 

Due to the use of the linear Debye-Hückel approximation in conjunction with the Manning condensation concept,
our approach is limited to monovalent salt and neglects ion-specific effects. Electrostatic nonlinear and correlation
effects could in principle be taken into account by more advanced modeling using variational approaches [45], while
ion-specific effects could be straightforwardly included using effective interactions between different ions and RNA
bases [60]. More complex phenomena involving multivalent ions such as Mg²⁺ could in principle be modeled by
allowing for a few tertiary contacts, which is left for future studies.

We find that for a proper description of RNA melting curves, correct modeling of the loop entropy is crucial. A 
non-zero loop exponent leads to an increased cooperativity of the melting transition and thus makes the heat capacity 
curve narrower in good agreement with experimental results. We conclude that for a correct description of RNA 
denaturation thermodynamics, both loop entropy and salt effects are important and should be included in standard 
structure and melting curve prediction software. 
A. MINIMUM FREE ENERGY STRUCTURE PREDICTION

The partition function involves the enumeration and contribution of all possible secondary structures and can be
calculated exactly by the recursion relation eq. (11). From the partition function the free energy of an RNA
strand ranging from base i through j with M non-nested backbone bonds can be calculated

$$\mathcal{F}^M_{i,j} = -k_B T \ln Q^M_{i,j}. \qquad \text{(S1)}$$

Another important quantity is the minimum free energy (mfe) $\mathcal{E}$ associated with the ground state structure, also
called mfe structure. In analogy to the partition function we define the mfe of a substrand ranging from base i through
j with M non-nested backbone segments as $\mathcal{E}^M_{i,j}$. RNA structure prediction relies on the assumption that the native,
experimentally observed structure is given by the mfe structure. Following the same idea as in the recursive calculation
of the partition function, see eq. (11), we can determine the minimum free energy of a sequence by replacing the sums
by the min operator and obtain the recursion relations

£^^\=min\£^.,^j^,n J£^^^^ (S2a) 

£k,j+i is the free energy of the base pair (k,j + 1). From the three-dimensional array the mfe structure can now be 
obtained by a recursive backtracking algorithm. The following pseudo code initiates the backtracking 

    /* E[i][j][M] holds the mfe of the strand i..j with M non-nested backbone bonds */
    E_buffer = INFINITY;
    for (M = 0; M < N; M++) {
        if (E[0][N][M] < E_buffer) {
            E_buffer = E[0][N][M];
            M_min = M;
        }
    }
    backtrack(0, N, M_min);

+49-89-28914337, Fax: +49-89-28914642, E-mail: einert@ph.tum.de
† E-mail: netz@ph.tum.de

The function backtrack(i, j, M) performs the next backtracking step and calls itself recursively. Its pseudocode
reads

    backtrack(i, j, M) {
        /* Check whether (i, j, M) is an allowed triple; this includes e.g.
             M < j-i,
             if M == 0 then base i and base j must be complementary.
           If (i, j, M) is not allowed, then E[i][j][M] == INFINITY. */
        if (E[i][j][M] < INFINITY) {
            if (M == 0 && i < j) {
                /* Bases i and j are paired => add to list of base pairs.
                   A single base (i == j) carries no pair and ends the recursion. */
                add_to_list_of_base_pairs(i, j);
                E_buffer = INFINITY;
                /* Determine the size m_min of the loop closed by the pair (i, j) */
                for (m = 0; m < j-i; m++) {
                    if (E[i+1][j-1][m] < E_buffer) {
                        E_buffer = E[i+1][j-1][m];
                        m_min = m;
                    }
                }
                /* Perform the next backtracking step */
                backtrack(i+1, j-1, m_min);
            }
            else if (M > 0) {
                E_buffer = INFINITY;
                /* k runs up to and including j so that the single-base case k == j is covered */
                for (k = i+1; k <= j; k++) {
                    /* Split the structure into two structures. The left structure
                       ranges from i through k-1 and has M-1 non-nested backbone bonds.
                       The right structure ranges from k through j and has zero
                       non-nested backbone bonds, i.e. it is either a single
                       base (k == j) or a closed structure terminated by the pair (k, j) */
                    if (E[i][k-1][M-1] + E[k][j][0] < E_buffer) {
                        E_buffer = E[i][k-1][M-1] + E[k][j][0];
                        k_min = k;
                    }
                }
                /* Perform next backtracking step on the left substructure */
                backtrack(i, k_min-1, M-1);
                /* Perform next backtracking step on the right substructure */
                if (k_min != j) {
                    backtrack(k_min, j, 0);
                }
            }
        }
    }
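
The idea of replacing the sum of a partition-function recursion by the min operator and then backtracking can be tried out on a much simpler model than the one used in this paper. The following sketch is not the recursion of eq. (S2): it is a Nussinov-type toy model without the index M, without loop entropy, stacking, or salt corrections. Every allowed base pair contributes the same hypothetical energy of −1, and a minimum loop size of three bases is assumed. All names are illustrative.

```python
# Toy model (assumption): every allowed pair contributes the same energy.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair


def mfe_fold(seq, pair_energy=-1.0):
    """Return (minimum energy, sorted list of base pairs) for the toy pairing model."""
    n = len(seq)
    if n == 0:
        return 0.0, []
    # energy[i][j]: minimum energy of the substrand i..j (inclusive); 0.0 if too short to pair
    energy = [[0.0] * n for _ in range(n)]

    def candidate(i, k, j):
        """Energy if base j pairs with base k; the strand splits into i..k-1 and k+1..j-1."""
        left = energy[i][k - 1] if k > i else 0.0
        return left + energy[k + 1][j - 1] + pair_energy

    def partners(i, j):
        """Bases k in the strand i..j that may pair with base j."""
        return [k for k in range(i, j - MIN_LOOP) if (seq[k], seq[j]) in ALLOWED_PAIRS]

    # Fill the table from short to long substrands.
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            options = [energy[i][j - 1]]  # base j stays unpaired
            options += [candidate(i, k, j) for k in partners(i, j)]
            energy[i][j] = min(options)   # min takes the place of the sum

    # Backtracking: find which option produced each table entry.
    pairs, todo = [], [(0, n - 1)]
    while todo:
        i, j = todo.pop()
        if j - i <= MIN_LOOP:
            continue
        if energy[i][j] == energy[i][j - 1]:
            todo.append((i, j - 1))
            continue
        for k in partners(i, j):
            if energy[i][j] == candidate(i, k, j):
                pairs.append((k, j))
                todo.append((k + 1, j - 1))
                if k > i:
                    todo.append((i, k - 1))
                break
    return energy[0][n - 1], sorted(pairs)


print(mfe_fold("GGGAAAACCC"))  # (-3.0, [(0, 9), (1, 8), (2, 7)])
```

The backtracking here uses an explicit stack instead of recursion, which avoids deep call chains for long sequences. The split into a left and a right substructure is the same as in the pseudocode for backtrack.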
B. INTERPOLATION FORMULAS



In order to make our results more user-friendly, we give interpolation formulas for the salt corrections that involve
hypergeometric and Bessel functions, as those might not be available to every user.

The salt correction for the free energy of loops contains two generalized hypergeometric functions 



ln(«;m/ss) - ln(7r/2) + 7 ^1^2 |^l/2, ^3^2^ ' 



(S3) 



The sum of the two hypergeometric functions is well approximated by interpolating between the two asymptotic 
expansions for small and large argument 



y I , /r, I I \ f y \ \ . ^ fy\ t:. I I ^\ i ( y 

^ 2 

,,3 „,2 



2 



(27r)6 ^ ' \ (27r) 



(S4) 



The salt correction of the binding free energy of a base pair contains a modified Bessel function of the second kind, 
eq. (8), 

$$g_h^{\mathrm{salt}} = g_h^{\mathrm{DH}}(p) - g_h^{\mathrm{DH}}(1\,\mathrm{M}), \qquad \text{(S5)}$$

with 

gDH(p) = 2kBTTll^JMKd) ■ (S6) 
Eq. (S5) is well approximated by the heuristic formula 

gr/h « k^TrikJ, ( + , (S7) 



1 + fesK 

with the constants b_1 = 0.0315171, b_2 = 35.0754, b_3 = 1.62292, b_4 = 1 nm, b_5 = 4.26381 nm.
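
Where no special-function library is at hand, the Bessel function can also be evaluated directly with a few lines of standard-library code instead of an interpolation formula. The sketch below assumes that the function in question is of order zero, K_0, which is the form that the Debye-Hückel potential of a line charge takes. It uses the integral representation K_0(x) = ∫_0^∞ exp(−x cosh t) dt and the trapezoidal rule. The function name, step size, and cutoff are illustrative choices.

```python
import math


def bessel_k0(x, step=0.01):
    """Modified Bessel function of the second kind, order 0, for x > 0.

    Evaluates K_0(x) = integral_0^inf exp(-x cosh t) dt with the trapezoidal rule.
    The integral is cut off where the integrand has dropped below exp(-40) * exp(-x).
    """
    if x <= 0:
        raise ValueError("x must be positive")
    t_max = math.acosh(1.0 + 40.0 / x)      # x * cosh(t_max) = x + 40
    n_steps = int(math.ceil(t_max / step))
    h = t_max / n_steps
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(t_max)))
    for i in range(1, n_steps):
        total += math.exp(-x * math.cosh(i * h))
    return total * h


print(f"{bessel_k0(1.0):.6f}")  # 0.421024
```

The integrand is smooth and decays very fast, so the trapezoidal rule converges quickly here and a step of 0.01 is more than sufficient for plotting or fitting purposes.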
C. FREE ENERGY PARAMETERS



The enthalpy h and entropy s parameters used in our calculations are taken from references [1, 2]. For instance, the
entries in the row UA and the column GC in the following table, h_{UA,GC} and s_{UA,GC}, give the enthalpy and entropy
contribution due to the stacking of the two neighboring base pairs UA and GC, where U and C are located at the
5'-end. The two bottom rows contain the initiation and termination contribution to helices and loops. For instance,
the total free enthalpy of a helix of three base pairs terminated by a hairpin loop of length m = 6 at 1 M NaCl is given
by

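$$
\begin{aligned}
g &= g_h^{\mathrm{init}} + g_h^{\mathrm{stack}} + g_h^{\mathrm{term}} + g_l^{\mathrm{init}} + g_l^{\mathrm{conf}} \qquad \text{(S8)} \\
  &= h_h^{\mathrm{init}} - T s_h^{\mathrm{init}} \\
  &\quad + h_{CG,GC} + h_{GC,AU} - T \left( s_{CG,GC} + s_{GC,AU} \right) \\
  &\quad + h_h^{\mathrm{term}} - T s_h^{\mathrm{term}} \\
  &\quad + h_l^{\mathrm{init}} - T s_l^{\mathrm{init}} \\
  &\quad - k_B T \ln m^{-c} .
\end{aligned}
$$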



Enthalpy h / (kcal/mol)

               AU      UA      CG      GC      GU      UG
AU           -6.82   -9.38  -11.40  -10.48   -3.21   -8.81
UA           -7.69   -6.82  -12.44  -10.44   -6.99  -12.83
CG          -10.44  -10.48  -13.39  -10.64   -5.61  -12.11
GC          -12.40  -11.40  -14.88  -13.39   -8.33  -12.59
GU          -12.83   -8.81  -12.59  -12.11  -13.47  -14.59
UG           -6.99   -3.21   -8.33   -5.61   -9.26  -13.47
init/term     3.72    3.72    0.00    0.00    3.72    3.72
init          1.68    1.68    1.68    1.68    1.68    1.68


Entropy s / (10^-3 kcal/(mol K))

               AU      UA      CG      GC      GU      UG
AU           -19.0   -26.7   -29.5   -27.1    -8.6   -24.0
UA           -20.5   -19.0   -32.5   -26.9   -19.3   -37.3
CG           -26.9   -27.1   -32.7   -26.7   -13.5   -32.2
GC           -32.5   -29.5   -36.9   -32.7   -21.9   -32.5
GU           -37.3   -24.0   -32.5   -32.2   -44.9   -51.2
UG           -19.3    -8.6   -21.9   -13.5   -30.8   -44.9
init/term     10.5    10.5     0.0     0.0    10.5    10.5
init          -0.7    -0.7    -0.7    -0.7    -0.7    -0.7
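
As a worked illustration of how the tabulated values enter eq. (S8), the following sketch evaluates only the stacking part and the conformational loop part of the example (stacks CG,GC and GC,AU, loop length m = 6). The initiation and termination terms are deliberately left out. The temperature of 310.15 K is an assumption made for this example, the loop exponent c = 2.1 is the value used for the P5ab hairpin, and the function names are hypothetical.

```python
import math

K_B = 1.987204e-3  # Boltzmann (gas) constant in kcal/(mol K)

# Stacking parameters read from the table: (row pair, column pair)
H_STACK = {("CG", "GC"): -10.64, ("GC", "AU"): -12.40}       # kcal/mol
S_STACK = {("CG", "GC"): -26.7e-3, ("GC", "AU"): -32.5e-3}   # kcal/(mol K)


def stacking_free_energy(base_pairs, temperature_k):
    """Sum of h - T s over all neighboring base pairs of a helix, in kcal/mol."""
    total = 0.0
    for stack in zip(base_pairs, base_pairs[1:]):
        total += H_STACK[stack] - temperature_k * S_STACK[stack]
    return total


def loop_conformational_free_energy(m, temperature_k, c=2.1):
    """Conformational loop term -k_B T ln m^(-c) = c k_B T ln m, in kcal/mol."""
    return -K_B * temperature_k * math.log(m ** (-c))


T = 310.15  # assumed temperature in K (37 degrees Celsius)
g_stack = stacking_free_energy(["CG", "GC", "AU"], T)
g_conf = loop_conformational_free_energy(6, T)
print(f"stacking: {g_stack:.2f} kcal/mol, loop entropy: {g_conf:.2f} kcal/mol")
```

In this example the two stacks stabilize the helix by about 4.7 kcal/mol, while the loop entropy term for m = 6 costs about 2.3 kcal/mol, so the two contributions are of comparable size.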



D. SEQUENCES 



The sequence of yeast tRNA-phe reads [3] 
GCGGAUUUAG CUCAGUUGGG AGAGCGCCAG ACUGAAGAUC UGGAGGUCCU GUGUUCGAUC CACAGAAUUC GCACCA. 

The sequence of the P5ab hairpin reads [4] 
ACAGCCGUUC AGUACCAAGU CUCAGGGGAA ACUUUGAGAU GGGGUGCUGA CGGACA. 